Bag-of-Words as Target for Neural Machine Translation
A sentence can be translated into more than one correct sentence. However, most existing neural machine translation models use only one of the correct translations as the target, and the other correct translations are penalized as incorrect during training. Since most correct translations of a sentence share a similar bag-of-words, it is possible to distinguish the correct translations from the incorrect ones by their bag-of-words. In this paper, we propose an approach that uses both the sentences and the bag-of-words as targets in the training stage, in order to encourage the model to generate potentially correct sentences that do not appear in the training set. We evaluate our model on a Chinese-English translation dataset, and experiments show that our model outperforms the strong baselines by a BLEU score of 4.55.
Comment: accepted by ACL 201
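The intuition above, that paraphrased correct translations share a bag-of-words while incorrect ones do not, can be illustrated with a minimal word-order-invariant overlap score. This is only an illustrative sketch, not the paper's actual training objective; the `bow_f1` function and the example sentences are hypothetical.

```python
from collections import Counter

def bow_f1(candidate, reference):
    """F1 overlap between the bags of words of two sentences (illustrative)."""
    cand, ref = Counter(candidate.split()), Counter(reference.split())
    overlap = sum((cand & ref).values())  # multiset intersection
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

ref = "the cat sat on the mat"
paraphrase = "on the mat the cat sat"   # same words, different order
wrong = "the dog ran in the park"

# A word-order-invariant score rates the paraphrase far above the wrong sentence.
assert bow_f1(paraphrase, ref) == 1.0
assert bow_f1(wrong, ref) < 0.5
```

Because the score ignores word order, it does not punish a correct reordering the way a single-reference sequence loss would.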
Decoding-History-Based Adaptive Control of Attention for Neural Machine Translation
The attention-based sequence-to-sequence model has proved successful in Neural Machine Translation (NMT). However, attention computed without consideration of the decoding history, which includes the past information in the decoder and in the attention mechanism, often causes much repetition. To address this problem, we propose decoding-history-based Adaptive Control of Attention (ACA) for the NMT model. ACA learns to control the attention by keeping track of the decoding history and the current information with a memory vector, so that the model can take both the already-translated content and the current information into consideration. Experiments on Chinese-English and English-Vietnamese translation demonstrate that our model significantly outperforms the strong baselines. The analysis shows that our model is capable of generating translations with less repetition and higher accuracy. The code will be available at https://github.com/lancopk
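The general mechanism, a memory of past attention used to discount already-attended source positions, can be sketched as follows. This is a simplified coverage-style toy, not the paper's learned ACA controller; the `penalty` coefficient and the scalar scores are hypothetical.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def controlled_attention(scores, memory, penalty=2.0):
    """Attention that discounts source positions already attended to.

    `memory` accumulates past attention weights across decoding steps;
    `penalty` scales how strongly attended positions are suppressed
    (both are hypothetical stand-ins for the learned memory vector).
    """
    adjusted = [s - penalty * m for s, m in zip(scores, memory)]
    weights = softmax(adjusted)
    new_memory = [m + w for m, w in zip(memory, weights)]
    return weights, new_memory

scores = [2.0, 0.5, 0.1]        # raw alignment scores for 3 source words
memory = [0.0, 0.0, 0.0]
w1, memory = controlled_attention(scores, memory)
w2, memory = controlled_attention(scores, memory)
# After attending heavily to position 0, its weight drops at the next step,
# which is the behavior that suppresses repeated translation of the same word.
assert w2[0] < w1[0]
```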
Autoencoder as Assistant Supervisor: Improving Text Representation for Chinese Social Media Text Summarization
Most of the current abstractive text summarization models are based on the
sequence-to-sequence model (Seq2Seq). The source content of social media is
long and noisy, so it is difficult for Seq2Seq to learn an accurate semantic
representation. Compared with the source content, the annotated summary is
short and well written. Moreover, it shares the same meaning as the source
content. In this work, we supervise the learning of the representation of the
source content with that of the summary. In implementation, we regard a summary
autoencoder as an assistant supervisor of Seq2Seq. Following previous work, we
evaluate our model on a popular Chinese social media dataset. Experimental results show that our model achieves state-of-the-art performance on the benchmark dataset.
Comment: accepted by ACL 201
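The idea of an "assistant supervisor" can be illustrated as an auxiliary loss term that pulls the source representation toward the summary autoencoder's representation. This is a minimal sketch, assuming a cosine-distance penalty and a weighting coefficient `lam`; the actual paper's formulation may differ.

```python
def cosine_distance(u, v):
    """1 - cosine similarity between two representation vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return 1.0 - dot / (nu * nv)

def total_loss(seq2seq_loss, src_repr, summary_repr, lam=0.5):
    """Seq2Seq loss plus a term pulling the source representation toward
    the summary autoencoder's representation (`lam` is a hypothetical
    weighting coefficient)."""
    return seq2seq_loss + lam * cosine_distance(src_repr, summary_repr)

# Identical representations add no penalty; orthogonal ones add lam * 1.
assert total_loss(1.0, [1.0, 0.0], [1.0, 0.0]) == 1.0
assert total_loss(1.0, [1.0, 0.0], [0.0, 1.0]) == 1.5
```

The design intent is that the short, well-written summary provides a cleaner semantic target than the long, noisy source text alone.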
A Deep Reinforced Sequence-to-Set Model for Multi-Label Text Classification
Multi-label text classification (MLTC) aims to assign multiple labels to each
sample in the dataset. The labels usually have internal correlations. However,
traditional methods tend to ignore the correlations between labels. In order to
capture the correlations between labels, the sequence-to-sequence (Seq2Seq)
model views the MLTC task as a sequence generation problem, which achieves
excellent performance on this task. However, the Seq2Seq model is not inherently suitable for the MLTC task, because it requires a human-predefined order for the output labels, whereas the labels in the MLTC task are essentially an unordered set rather than an ordered sequence. This conflicts with the strict requirement of the Seq2Seq model on the label order. In this paper, we propose a novel sequence-to-set framework utilizing deep reinforcement learning, which not only captures the correlations between labels but also reduces the dependence on the label order. Extensive experimental results show that our proposed method outperforms the competitive baselines by a large margin.
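The key property a reinforcement-learning reward buys here is order invariance: the reward should score the predicted label *set*, not the generation order. A common choice for such a reward is set-level F1, sketched below; treating F1 as the RL reward is an assumption for illustration, not necessarily the paper's exact reward.

```python
def set_f1(predicted, gold):
    """Order-invariant reward: F1 between predicted and gold label sets."""
    p, g = set(predicted), set(gold)
    if not p or not g:
        return 0.0
    overlap = len(p & g)
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(p), overlap / len(g)
    return 2 * precision * recall / (precision + recall)

gold = ["sports", "politics", "economy"]
# Any permutation of a correct prediction earns the same reward,
# so the model is no longer punished for generating labels in a
# different order than the predefined one.
assert set_f1(["economy", "sports", "politics"], gold) == 1.0
assert set_f1(["politics", "sports", "economy"], gold) == 1.0
assert set_f1(["sports", "weather"], gold) < 1.0
```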
Global Encoding for Abstractive Summarization
In neural abstractive summarization, the conventional sequence-to-sequence
(seq2seq) model often suffers from repetition and semantic irrelevance. To
tackle the problem, we propose a global encoding framework, which controls the information flow from the encoder to the decoder based on the global information of the source context. It consists of a convolutional gated unit that performs global encoding to improve the representations of the source-side information. Evaluations on both LCSTS and English Gigaword demonstrate that our model outperforms the baseline models, and the analysis shows that our model is capable of reducing repetition.
Comment: Accepted by ACL 201
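A convolutional gated unit of the kind described can be sketched with scalar encoder states: a sigmoid of a small convolution over each position's neighborhood gates the state, so context decides how much of each representation flows onward. The scalar states and length-3 kernel are illustrative simplifications of the vector-valued original.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def conv_gated_unit(seq, kernel):
    """Gate each position by a sigmoid of a 1-D convolution over its
    neighborhood, letting context-informed gates filter the encoder
    states (scalar states and a length-3 kernel for illustration)."""
    k = len(kernel) // 2
    padded = [0.0] * k + seq + [0.0] * k   # same-length output
    gated = []
    for i in range(len(seq)):
        window = padded[i:i + len(kernel)]
        g = sigmoid(sum(w * x for w, x in zip(kernel, window)))
        gated.append(g * seq[i])
    return gated

states = [1.0, -2.0, 3.0, 0.5]
out = conv_gated_unit(states, [0.2, 0.5, 0.2])
assert len(out) == len(states)
# Gates lie in (0, 1), so each state is attenuated, never amplified.
assert all(abs(o) <= abs(s) for o, s in zip(out, states))
```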
Semantic-Unit-Based Dilated Convolution for Multi-Label Text Classification
We propose a novel model for multi-label text classification, which is based
on sequence-to-sequence learning. The model generates higher-level semantic
unit representations with multi-level dilated convolution as well as a corresponding hybrid attention mechanism that extracts information at both the word level and the semantic-unit level. Our designed dilated convolution effectively reduces dimension and supports an exponential expansion of receptive fields without loss of local information, and the attention-over-attention mechanism is able to capture more summary-relevant information from the source context. Results of our experiments show that the proposed model has significant advantages over the baseline models on the RCV1-V2 and Ren-CECps datasets, and our analysis demonstrates that our model is competitive with the deterministic hierarchical models and more robust in classifying low-frequency labels.
Comment: EMNLP 201
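The "exponential expansion of receptive fields" claim rests on how dilated convolutions work: with dilation d, a length-3 kernel spans 2d+1 input positions, so stacking dilations 1, 2, 4, ... grows coverage geometrically with constant parameters. A minimal 1-D sketch (toy scalar sequence, not the model's actual filters):

```python
def dilated_conv1d(seq, kernel, dilation):
    """1-D dilated convolution (valid padding): the kernel taps are
    `dilation` positions apart, expanding the receptive field without
    adding parameters or losing intermediate values."""
    span = (len(kernel) - 1) * dilation
    return [
        sum(kernel[j] * seq[i + j * dilation] for j in range(len(kernel)))
        for i in range(len(seq) - span)
    ]

seq = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
k = [1.0, 1.0, 1.0]
# The same 3-tap kernel covers 3, 5, then 9 input positions as the
# dilation doubles, which is the exponential receptive-field growth.
assert dilated_conv1d(seq, k, 1)[0] == 1.0 + 2.0 + 3.0   # spans 3 positions
assert dilated_conv1d(seq, k, 2)[0] == 1.0 + 3.0 + 5.0   # spans 5 positions
assert dilated_conv1d(seq, k, 4)[0] == 1.0 + 5.0 + 9.0   # spans 9 positions
```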
Deconvolution-Based Global Decoding for Neural Machine Translation
A great proportion of sequence-to-sequence (Seq2Seq) models for Neural Machine Translation (NMT) adopt a Recurrent Neural Network (RNN) to generate the translation word by word in a sequential order. As studies in linguistics have shown that language is not a linear word sequence but a sequence of complex structure, translation at each step should be conditioned on the whole target-side context. To tackle the problem, we propose a new NMT model that decodes the sequence with the guidance of its structural prediction of the context of the target sequence. Our model generates the translation based on the structural prediction of the target-side context, so that the translation can be freed from the constraint of sequential order. Experimental results demonstrate that our model is more competitive compared with the state-of-the-art methods, and the analysis shows that our model is also robust in translating sentences of different lengths and that it reduces repetition, with guidance from the target-side context for decoding.
Comment: Accepted by COLING 201
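The structural difference from word-by-word decoding is that a deconvolution (transposed convolution) expands a compact representation into a whole target-length context in one shot. A minimal 1-D sketch of that expansion, with toy scalar values rather than the model's learned filters:

```python
def deconv1d(latent, kernel, stride=2):
    """1-D transposed convolution: expands a short latent sequence into a
    longer one, so a global structural prediction of the whole target-side
    context can be produced at once rather than word by word."""
    out_len = (len(latent) - 1) * stride + len(kernel)
    out = [0.0] * out_len
    for i, v in enumerate(latent):
        for j, w in enumerate(kernel):
            out[i * stride + j] += v * w   # each latent value paints a window
    return out

latent = [1.0, 2.0]                 # compact sentence-level representation
ctx = deconv1d(latent, [1.0, 0.5, 0.25])
# Deconvolution expands length, the reverse of a strided convolution,
# giving the decoder a full-width target-side context to condition on.
assert len(ctx) > len(latent)
```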
Recoil-ion momentum spectroscopy of photoionization of cold rubidium atoms in a strong laser field
We study photoionization of cold rubidium atoms in a strong infrared laser field using a magneto-optical trap (MOT) recoil-ion momentum spectrometer. Three types of cold rubidium target are provided, operating as a two-dimensional (2D) MOT, 2D molasses, and 3D MOT, with densities on the order of atoms/cm^3, atoms/cm^3, and atoms/cm^3, respectively. The density profile and the temperature of the 3D MOT are characterized using absorption imaging and photoionization. The momentum distributions of Rb+ ions created by absorption of two or three photons exhibit a dipole-like double-peak structure, in good agreement with the results of the strong-field approximation. A momentum resolution of a.u. is achieved in comparison with theoretical calculations, exhibiting great prospects for the study of electron correlations in alkali-metal atoms through interaction with strong laser pulses.
Disentangling the role of laser coupling in directional breaking of molecules
The directional control of molecular dissociation with the laser electric field waveform is a paradigm and has been demonstrated for a variety of molecules. In most cases, the directional control occurs via a dissociative ionization pathway. The role of laser-induced coupling of electronic states in the dissociating ion, versus selective ionization of oriented neutral molecules, however, could not be distinguished even for small heteronuclear molecules such as CO. Here, we introduce a technique, using elliptically polarized pump and linearly polarized two-color probe pulses, that unambiguously distinguishes the roles of laser-induced state coupling and selective ionization. The measured photoelectron momentum distributions, governed by the light polarizations, allow us to identify in coincidence the ionization and dissociation from the pump and probe pulses. Directional dissociation of CO+ as a function of the relative phase of the linearly polarized two-color pulse is observed for both parallel and orthogonally oriented molecules. We find that the laser-induced coupling of various electronic states of CO+ plays an important role in the observed directional bond breaking, which is verified by quantum calculations.
Comment: 7 pages, 6 figures
Momentum spectroscopy for multiple ionization of cold rubidium in the elliptically polarized laser field
Employing the recently developed magneto-optical trap recoil-ion momentum spectroscopy (MOTRIMS), which combines cold-atom, strong-laser-pulse, and ultrafast technologies, we study momentum distributions of multiply ionized cold rubidium (Rb) induced by elliptically polarized laser pulses (35 fs, W/cm^2). The complete vector momenta of Rb^n+ ions up to charge state n = 4 are recorded with extremely high resolution (0.12 a.u. for Rb+). Variations of the characteristic multi-band structures displayed in the momentum distributions, as the ellipticity varies from linear to circular polarization, are interpreted qualitatively with the classical over-barrier ionization model. The present momentum spectroscopy of cold heavy alkali atoms reveals novel strong-field phenomena beyond the noble gases.
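The classical over-barrier (barrier-suppression) ionization model invoked above has a standard textbook form: the critical field in atomic units is E_crit = Ip^2 / (4Z), and intensity follows from I = 3.51e16 W/cm^2 x E^2 (a.u.). A back-of-the-envelope sketch for rubidium's first ionization (Ip = 4.18 eV), showing why a low-Ip alkali atom is a qualitatively different strong-field target than a noble gas; the helper function and numerical inputs are illustrative, not taken from the paper.

```python
# Classical over-barrier (barrier-suppression) ionization threshold,
# E_crit = Ip^2 / (4 Z) in atomic units, applied to rubidium (Ip = 4.18 eV).
HARTREE_EV = 27.211      # 1 a.u. of energy in eV
AU_INTENSITY = 3.51e16   # intensity (W/cm^2) of a 1 a.u. electric field

def over_barrier_intensity(ip_ev, charge=1):
    """Laser intensity at which the Coulomb barrier is fully suppressed."""
    ip_au = ip_ev / HARTREE_EV
    e_crit = ip_au ** 2 / (4 * charge)    # critical field, a.u.
    return AU_INTENSITY * e_crit ** 2     # intensity, W/cm^2

i_rb = over_barrier_intensity(4.18)
# Rb is already over-barrier near ~1e12 W/cm^2, orders of magnitude below
# the ~1e14 W/cm^2 scale for noble gases such as argon (Ip = 15.76 eV).
assert 1e12 < i_rb < 2e12
assert over_barrier_intensity(15.76) > 1e14
```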